ABSTRACT

We develop estimates for the parameters of the Dirichlet-multinomial
distribution (DMD) when there is insufficient data to obtain the maximum
likelihood or method of moments estimates known in the literature. We do,
however, have supplementary beta-binomial data pertaining to the marginals
of the DMD, and use these data when estimating the DMD parameters. A
real situation and data set are given where our estimates are applicable.

1.  INTRODUCTION 


Suppose we have $t + 1$ mutually exclusive events and $Y_j$ is the number
of times that event $j$ occurs out of $k$ independent trials, $j = 0, 1, \ldots, t$. Let
$Y$, conditional on the vector of probabilities $\Pi = \pi$, have a multinomial
distribution, i.e., $Y \mid \pi = (Y_0, \ldots, Y_t) \mid \Pi = \pi \sim \text{multinomial}(k, \pi_0, \ldots, \pi_t)$.
Let $\Pi$ have a Dirichlet distribution; then compounding the multinomial
distribution with the Dirichlet gives the so-called Dirichlet (or $\beta$-) compound
multinomial distribution (Johnson and Kotz 1969), also known as
the compound multinomial distribution (Mosimann 1962). It is commonly
known as the Dirichlet-multinomial distribution, denoted by $\text{DMD}(k, \tau, \lambda)$,
$\tau > 0$, $\lambda_j > 0$, $j = 0, 1, \ldots, t$, $\sum_{j=0}^{t} \lambda_j = 1$.


An excellent literature review of the parameter estimation and applications
of the DMD was given by Chuang and Cox (1985), although they
did not mention the application of the DMD to magazine and TV exposure
data (Chandon 1976; Leckenby and Kishi 1984; Rust and Leone 1984), an
application we will give in Section 4.

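The DMD mass function is

$$P(Y = y) = \frac{k!}{\left(k - \sum_{j=1}^{t} y_j\right)!} \, \frac{\Gamma(\tau)}{\Gamma(\tau + k)} \, \frac{\Gamma\!\left(k - \sum_{j=1}^{t} y_j + \tau\lambda_0\right)}{\Gamma(\tau\lambda_0)} \times \prod_{j=1}^{t} \frac{\Gamma(y_j + \tau\lambda_j)}{\Gamma(\tau\lambda_j)\, y_j!}, \tag{1.1}$$

$$0 \le y_j \le k, \quad j = 1, \ldots, t, \quad \sum_{j=1}^{t} y_j \le k,$$

where $\Gamma(l) = (l - 1)\Gamma(l - 1)$ is the usual gamma function.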
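As a numerical check on (1.1), the following minimal sketch evaluates the mass function on the log scale and sums it over the support. The function names and the parameter values are illustrative assumptions, not part of the original development.

```python
from itertools import product
from math import exp, lgamma


def dmd_log_pmf(y, tau, lam):
    """Log of (1.1) for counts y = (y_0, ..., y_t), where k = sum(y)."""
    k = sum(y)
    out = lgamma(k + 1) + lgamma(tau) - lgamma(tau + k)
    for y_j, lam_j in zip(y, lam):
        out += lgamma(y_j + tau * lam_j) - lgamma(tau * lam_j) - lgamma(y_j + 1)
    return out


def dmd_support(k, n_cat):
    """All count vectors of length n_cat that sum to k."""
    for tail in product(range(k + 1), repeat=n_cat - 1):
        if sum(tail) <= k:
            yield (k - sum(tail),) + tail


# Hypothetical parameter values, chosen only for illustration.
tau, lam, k = 0.8, (0.4, 0.3, 0.2, 0.1), 3
total = sum(exp(dmd_log_pmf(y, tau, lam)) for y in dmd_support(k, len(lam)))
```

Working with `lgamma` rather than the gamma function itself avoids overflow when $k$ or $\tau$ is large.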

To fix ideas we will set $t = 3$. The data needed to estimate $\lambda_j$, $j =
0, 1, 2, 3$, are $(n_{i0}, n_{i1}, n_{i2}, n_{i3})$, $i = 1, \ldots, n$, where $n_{ij}$ is the number of
occurrences of event $j$ for the $i$th person and $n$ is the sample size ($\sum_{j=0}^{3} n_{ij} =
k$, $\forall\, i$). Denote the total number of people in the sample who fall into
category $j$ as $n_{.j} = \sum_{i=1}^{n} n_{ij}$.

Chuang and Cox (1985) estimated $\lambda_j$ with

$$\hat\lambda_j = \frac{\bar n_{.j}}{k}, \tag{1.2}$$

where $\bar n_{.j} = n_{.j}/n$. We still need to find an estimate of $\tau$, for which we now
give four different estimates which have appeared in the literature.

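Mosimann (1962) showed that the multinomial covariance matrix (denoted $\Sigma_\pi$)
and the covariance matrix of $Y$ (denoted $\Sigma_Y$) are related thus,

$$\Sigma_Y = \frac{k + \tau}{1 + \tau}\, \Sigma_\pi. \tag{1.3}$$

He suggested estimating $\Sigma_\pi$ with $\bar n_{.j}(k - \bar n_{.j})/k$ on the diagonal and
$-\bar n_{.j}\bar n_{.j'}/k$, $j \ne j'$, on the off-diagonal, and estimating $\Sigma_Y$ with $\sum_{i=1}^{n}(n_{ij} -
\bar n_{.j})^2/(n - 1)$ on the diagonal and $\sum_{i=1}^{n}(n_{ij} - \bar n_{.j})(n_{ij'} - \bar n_{.j'})/(n - 1)$ on
the off-diagonal. Notice that $\hat\Sigma_\pi$ and $\hat\Sigma_Y$ are nonsingular $3 \times 3$ matrices
(Mosimann 1962). Then, using (1.3),

$$\left(\frac{\det(\hat\Sigma_Y)}{\det(\hat\Sigma_\pi)}\right)^{1/3} = \frac{k + \hat\tau}{1 + \hat\tau},$$

from which $\hat\tau$ can be obtained.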

Brier's (1980) estimate of $\tau$ similarly comes from solving for $\hat\tau$ in the
following equation:

$$\frac{k + \hat\tau}{1 + \hat\tau} = \frac{1}{3(n - 1)} \sum_{i=1}^{n} \sum_{j=0}^{3} \frac{(n_{ij} - \bar n_{.j})^2}{\bar n_{.j}}.$$

Both Mosimann's and Brier's estimates are based on the method of moments
estimation technique. Owing to the form of Brier's estimate, Chuang and Cox
(1985) called it a chi-square moment estimate.
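Brier's chi-square moment estimate can be sketched as below. The function name and the toy counts are made up for illustration. Writing the divisor as (number of categories minus one) times $(n - 1)$ is an assumption that reduces to the $3(n - 1)$ above when $t = 3$.

```python
def brier_tau(rows):
    """Chi-square moment estimate of tau from rows of counts (n_i0, ..., n_it).

    Solves (k + tau) / (1 + tau) = C, where C is the chi-square statistic
    divided by t * (n - 1); with t = 3 this is the 3(n - 1) in the text.
    """
    n = len(rows)
    n_cat = len(rows[0])
    k = sum(rows[0])  # every row is assumed to sum to the same k
    means = [sum(r[j] for r in rows) / n for j in range(n_cat)]
    chi2 = sum((r[j] - means[j]) ** 2 / means[j]
               for r in rows for j in range(n_cat))
    c = chi2 / ((n_cat - 1) * (n - 1))
    if not 1.0 < c < k:
        raise ValueError("C must lie strictly between 1 and k for tau > 0")
    return (k - c) / (c - 1.0)


# Made-up counts (k = 3, two categories), only to exercise the function.
toy = [(3, 0), (0, 3), (2, 1), (1, 2)]
tau_hat = brier_tau(toy)
```

The guard makes the $k = 1$ failure explicit: $C$ cannot lie strictly between 1 and $k$ when $k = 1$, so no estimate is returned.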

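The likelihood equations used to find $\hat\lambda_j$ and $\hat\tau$ are

$$\begin{aligned}
\sum_{i=1}^{n} \sum_{l=0}^{n_{ij}-1} \frac{1}{l + \tau\lambda_j} &= \sum_{i=1}^{n} \sum_{l=0}^{n_{i0}-1} \frac{1}{l + \tau\lambda_0}, \quad j = 1, 2, 3, \\
\lambda_0 &= 1 - \sum_{j=1}^{3} \lambda_j, \\
\sum_{i=1}^{n} \sum_{j=0}^{3} \sum_{l=0}^{n_{ij}-1} \frac{\lambda_j}{l + \tau\lambda_j} &= n \sum_{l=0}^{k-1} \frac{1}{l + \tau}.
\end{aligned} \tag{1.4}$$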


Due to the numerical difficulties of obtaining a solution to the likelihood
equations of (1.4), Chuang and Cox (1985) estimated $\tau$ using the pseudo
maximum likelihood method of Gong and Samaniego (1981). Chuang and
Cox's method is to substitute the $\hat\lambda_j$ of (1.2) into (1.1) and then obtain the
likelihood equation, which involves just the parameter $\tau$. Their likelihood
equation is

$$\sum_{i=1}^{n} \sum_{j=0}^{3} \sum_{l=0}^{n_{ij}-1} \frac{\hat\lambda_j}{l + \hat\lambda_j\tau} = n \sum_{l=0}^{k-1} \frac{1}{l + \tau}.$$
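The pseudo-likelihood equation has a single unknown, so a one-dimensional root search is enough. The sketch below is not Chuang and Cox's own algorithm. The function names are hypothetical, bisection is a swapped-in solver, and it assumes the score changes sign on the supplied bracket.

```python
from math import lgamma


def pseudo_loglik(tau, rows, lam):
    """DMD log-likelihood in tau (constants dropped) with lambda held fixed."""
    k = sum(rows[0])
    total = 0.0
    for r in rows:
        total += lgamma(tau) - lgamma(tau + k)
        for n_ij, lam_j in zip(r, lam):
            total += lgamma(n_ij + tau * lam_j) - lgamma(tau * lam_j)
    return total


def pseudo_score(tau, rows, lam):
    """Left side minus right side of the pseudo-likelihood equation."""
    k = sum(rows[0])
    lhs = sum(lam_j / (l + lam_j * tau)
              for r in rows for n_ij, lam_j in zip(r, lam) for l in range(n_ij))
    rhs = len(rows) * sum(1.0 / (l + tau) for l in range(k))
    return lhs - rhs


def solve_tau(rows, lam, lo=1e-6, hi=1e6, iters=200):
    """Bisection on the score; assumes a sign change on [lo, hi]."""
    f_lo, f_hi = pseudo_score(lo, rows, lam), pseudo_score(hi, rows, lam)
    if f_lo * f_hi > 0:
        raise ValueError("score does not change sign on the bracket")
    for _ in range(iters):
        mid = (lo * hi) ** 0.5  # geometric midpoint, since tau spans decades
        if pseudo_score(mid, rows, lam) * f_lo > 0:
            lo = mid
        else:
            hi = mid
    return (lo * hi) ** 0.5
```

`pseudo_score` is the derivative of `pseudo_loglik` with respect to $\tau$, which gives a simple finite-difference check on the equation.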


However, if $k = 1$, we can use neither Mosimann's nor Brier's estimate of
$\tau$ since $(k + \tau)/(1 + \tau)$ is 1 when $k = 1$. In addition, the maximum likelihood
and pseudo maximum likelihood methods do not give unique solutions when
$k = 1$ as there are only three linearly independent data and four parameters
to estimate. It is precisely when $k = 1$ that we desire to estimate $\tau$. A reason
for this will be apparent in Section 4.
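The lack of information about $\tau$ when $k = 1$ can be seen directly from (1.1). As a short worked step, take the outcome with $y_j = 1$ and every other count zero:

```latex
P(Y_j = 1)
  = \frac{\Gamma(\tau)}{\Gamma(\tau + 1)} \cdot
    \frac{\Gamma(1 + \tau\lambda_j)}{\Gamma(\tau\lambda_j)}
  = \frac{1}{\tau} \cdot \tau\lambda_j
  = \lambda_j ,
  \qquad j = 0, 1, 2, 3 .
```

The likelihood is therefore proportional to $\prod_j \lambda_j^{n_{.j}}$, which does not involve $\tau$.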


2.  ESTIMATING $\tau$ WHEN $k = 1$


Let $X_1 = Y_1 + Y_3$ and $X_2 = Y_2 + Y_3$; then $(X_1, X_2)$ is the bivariate
distribution of the total number of occurrences of events 1 and 3 and events 2
and 3, respectively. The marginal distribution of each of the $Y_j$ is the beta-binomial
distribution (BBD), denoted by $\text{BBD}(k, \tau\lambda_j, \tau(1 - \lambda_j))$, whose mass
function is obtained by letting $t = 1$ in (1.1).

An application of some general DMD theorems in Basu and de B. Pereira
(1982) shows that

$$\begin{aligned}
X_1 &\sim \text{BBD}(k, \tau(\lambda_1 + \lambda_3), \tau(\lambda_0 + \lambda_2)), \\
X_2 &\sim \text{BBD}(k, \tau(\lambda_2 + \lambda_3), \tau(\lambda_0 + \lambda_1)).
\end{aligned} \tag{2.1}$$

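The joint mass function of $X_1$ and $X_2$ is

$$g(X_1 = x_1, X_2 = x_2) = \frac{k!\,\Gamma(\tau)}{\Gamma(\tau + k)} \sum_{x_3 = \max\{0,\, x_1 + x_2 - k\}}^{\min\{x_1,\, x_2\}} \frac{\Gamma(x_1 - x_3 + \tau\lambda_1)\,\Gamma(x_2 - x_3 + \tau\lambda_2)\,\Gamma(x_3 + \tau\lambda_3)\,\Gamma(k + x_3 - x_1 - x_2 + \tau\lambda_0)}{(x_1 - x_3)!\,(x_2 - x_3)!\,x_3!\,(k + x_3 - x_1 - x_2)!\,\prod_{j=0}^{3} \Gamma(\tau\lambda_j)}, \tag{2.2}$$

$$0 \le x_i \le k, \quad i = 1, 2.$$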
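A minimal sketch of (2.2) follows, together with the beta-binomial mass function needed to check the marginals in (2.1). The function names and the parameter values used in any check are assumptions made for illustration.

```python
from math import exp, lgamma


def joint_pmf(x1, x2, k, tau, lam):
    """g(X1 = x1, X2 = x2) from (2.2); lam = (lam0, lam1, lam2, lam3)."""
    a = [tau * l for l in lam]
    const = (lgamma(k + 1) + lgamma(tau) - lgamma(tau + k)
             - sum(lgamma(v) for v in a))
    total = 0.0
    for x3 in range(max(0, x1 + x2 - k), min(x1, x2) + 1):
        # Counts (y0, y1, y2, y3) implied by x1, x2 and the overlap x3.
        y = (k + x3 - x1 - x2, x1 - x3, x2 - x3, x3)
        total += exp(const + sum(lgamma(y_j + a_j) - lgamma(y_j + 1)
                                 for y_j, a_j in zip(y, a)))
    return total


def bbd_pmf(x, k, alpha, beta):
    """Beta-binomial mass function, i.e. (1.1) with t = 1."""
    return exp(lgamma(k + 1) - lgamma(x + 1) - lgamma(k - x + 1)
               + lgamma(x + alpha) + lgamma(k - x + beta)
               - lgamma(alpha) - lgamma(beta)
               + lgamma(alpha + beta) - lgamma(alpha + beta + k))
```

Summing `joint_pmf` over `x2` should reproduce `bbd_pmf(x1, k, tau * (lam[1] + lam[3]), tau * (lam[0] + lam[2]))`, which is the first line of (2.1).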

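We saw in Section 1 that to estimate $\tau$ when $k = 1$ we need some extra
data. From (2.1), $X_1 \sim \text{BBD}(k, \tau(\lambda_1 + \lambda_3), \tau(\lambda_0 + \lambda_2))$, so we can estimate
$\tau(\lambda_1 + \lambda_3)$ and $\tau(\lambda_0 + \lambda_2)$ using supplementary data pertaining to $X_1$, if such
data are available; similarly for $X_2$. Define $\alpha_i$ and $\beta_i$, $i = 1, 2$, as follows:

$$\begin{aligned}
\tau(\lambda_1 + \lambda_3) &= \alpha_1, & \tau(\lambda_0 + \lambda_2) &= \beta_1, \\
\tau(\lambda_2 + \lambda_3) &= \alpha_2, & \tau(\lambda_0 + \lambda_1) &= \beta_2.
\end{aligned} \tag{2.3}$$

From (2.3), $\alpha_i + \beta_i = \tau(\lambda_0 + \lambda_1 + \lambda_2 + \lambda_3) = \tau$, $i = 1, 2$. This means that
when $\alpha_i$ and $\beta_i$ are estimated using supplementary BBD data the estimates
should be constrained so that

$$\alpha_1 + \beta_1 = \alpha_2 + \beta_2 = \tau. \tag{2.4}$$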

The problem with trying to use constraint (2.4) is that $\tau$ is unknown. Chandon
(1976) could not, so did not, apply constraint (2.4) when estimating
$\alpha_i$ and $\beta_i$. As a result, $\hat\alpha_1 + \hat\beta_1 \ne \hat\alpha_2 + \hat\beta_2$, where $\hat\alpha_i$ and $\hat\beta_i$ are MLEs or
method of moments estimates obtained by using supplementary BBD data
for $X_i$, $i = 1, 2$. Knowing this, he took a weighted average of $\hat\alpha_1 + \hat\beta_1$ and
$\hat\alpha_2 + \hat\beta_2$ to estimate $\tau$ with

$$\hat\tau_c = \sum_{i=1}^{2} (\hat\alpha_i + \hat\beta_i) \frac{w_i}{w_1 + w_2}, \quad \text{where } w_i = \frac{\hat\alpha_i}{\hat\alpha_i + \hat\beta_i}, \quad i = 1, 2. \tag{2.5}$$

We found this unappealing since this estimator of $\tau$ is rather ad hoc. He
could equally well have chosen the arithmetic, geometric, or harmonic mean
of $(\hat\alpha_i + \hat\beta_i)$, $i = 1, 2$.

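Our procedure is as follows. Denote the correlation between $X_1$ and $X_2$
as $\rho_{X_1,X_2}$. Then (2.3) substituted into $\rho_{X_1,X_2}$ gives

$$\rho_{X_1,X_2} = \frac{\lambda_0\lambda_3 - \lambda_1\lambda_2}{\sqrt{(\lambda_1 + \lambda_3)(\lambda_0 + \lambda_2)(\lambda_2 + \lambda_3)(\lambda_0 + \lambda_1)}} = \frac{\tau^2(\lambda_0\lambda_3 - \lambda_1\lambda_2)}{\sqrt{\alpha_1\beta_1\alpha_2\beta_2}}. \tag{2.6}$$

Solving for $\tau$ in (2.6) gives

$$\tau = \left(\frac{\alpha_1\beta_1\alpha_2\beta_2}{(\lambda_1 + \lambda_3)(\lambda_0 + \lambda_2)(\lambda_2 + \lambda_3)(\lambda_0 + \lambda_1)}\right)^{1/4}. \tag{2.7}$$

From (2.3) it follows that

$$\begin{aligned}
\lambda_1 + \lambda_3 &= \frac{\alpha_1}{\alpha_1 + \beta_1}, & \lambda_0 + \lambda_2 &= \frac{\beta_1}{\alpha_1 + \beta_1}, \\
\lambda_2 + \lambda_3 &= \frac{\alpha_2}{\alpha_2 + \beta_2}, & \lambda_0 + \lambda_1 &= \frac{\beta_2}{\alpha_2 + \beta_2}.
\end{aligned} \tag{2.8}$$

Substituting the four equations of (2.8) into (2.7) gives

$$\tau = \sqrt{(\alpha_1 + \beta_1)(\alpha_2 + \beta_2)}. \tag{2.9}$$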

The above construction shows that it is more reasonable to estimate $\tau$ with
the geometric mean of $\hat\alpha_i + \hat\beta_i$, i.e., $\hat\tau_{gm} = \sqrt{(\hat\alpha_1 + \hat\beta_1)(\hat\alpha_2 + \hat\beta_2)}$, rather than
the weighted average estimate, $\hat\tau_c$.
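Both estimates are simple arithmetic once $\hat\alpha_i$ and $\hat\beta_i$ are available. The sketch below uses hypothetical function names and evaluates them on the MLEs reported in Table II.

```python
from math import sqrt


def tau_weighted(a1, b1, a2, b2):
    """Chandon's weighted average, with weights w_i = a_i / (a_i + b_i)."""
    w1, w2 = a1 / (a1 + b1), a2 / (a2 + b2)
    return ((a1 + b1) * w1 + (a2 + b2) * w2) / (w1 + w2)


def tau_geometric(a1, b1, a2, b2):
    """Geometric mean estimate from (2.9)."""
    return sqrt((a1 + b1) * (a2 + b2))


# MLEs of (alpha_1, beta_1, alpha_2, beta_2) as reported in Table II.
est = (0.0743, 0.1103, 0.0498, 0.4517)
tau_c, tau_gm = tau_weighted(*est), tau_geometric(*est)
```

Rounded to four decimals these give 0.2473 and 0.3043, the values of $\hat\tau_c$ and $\hat\tau_{gm}$ listed in Table II.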


3.  ASYMPTOTIC PROPERTIES OF $\hat\tau$


If $\alpha_i$ and $\beta_i$ are estimated with consistent estimates then, as $n \to \infty$,

$$\hat\tau_c \to \frac{\alpha_1 + \alpha_2}{\dfrac{\alpha_1}{\alpha_1 + \beta_1} + \dfrac{\alpha_2}{\alpha_2 + \beta_2}},$$

which equals $\tau$ iff (2.4) holds. On the other hand, $\hat\tau_{gm} \to \tau$ iff (2.9) holds.
Since (2.4) $\Rightarrow$ (2.9), but the converse is not true, $\hat\tau_{gm}$ is consistent under a
weaker assumption than that required to make $\hat\tau_c$ consistent.


We can compare the asymptotic relative efficiency (ARE) of $\hat\tau_c$ and
$\hat\tau_{gm}$ by examining the ratio of their asymptotic variances. Define $\text{ARE} =
AV(\hat\tau_c)/AV(\hat\tau_{gm})$, where $AV(\cdot)$ denotes asymptotic variance. The AV of the
two competing estimates can be obtained from knowledge of the asymptotic
joint distribution of the $\hat\alpha_i$ and $\hat\beta_i$ and use of the delta method.

If the $\hat\alpha_i$'s and $\hat\beta_i$'s are MLEs then the asymptotic joint distribution can
be obtained from general MLE theory (Lehmann 1983), i.e.,

$$\sqrt{n}(\hat\theta - \theta) \to \text{MVN}(0, I^{-1}(\theta)), \quad \text{as } n \to \infty, \tag{3.1}$$

where $\theta' = (\alpha_1, \beta_1, \alpha_2, \beta_2)$ and $I(\theta)$ is the information matrix. Danaher
(1987) proved that the regularity conditions for (3.1) to be true are satisfied
for the MLEs of the BBD parameters.

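Denote the information matrix of $(\alpha_i, \beta_i)$ as $I(\alpha_i, \beta_i)$, and the mass
function of $X_i \sim \text{BBD}(k, \alpha_i, \beta_i)$ as $f_{BBD}$, $i = 1, 2$. Then

$$I(\alpha_i, \beta_i) = \begin{pmatrix} \sum_{x_i=1}^{k} A(\alpha_i, x_i) f_{BBD} - A(\alpha_i + \beta_i, k) & -A(\alpha_i + \beta_i, k) \\ -A(\alpha_i + \beta_i, k) & \sum_{x_i=0}^{k-1} A(\beta_i, k - x_i) f_{BBD} - A(\alpha_i + \beta_i, k) \end{pmatrix},$$

where $A(r, l) = \sum_{j=0}^{l-1} 1/(r + j)^2$.

If it is assumed that the bivariate distribution of $(\hat\alpha_1, \hat\beta_1)$ is independent
of $(\hat\alpha_2, \hat\beta_2)$ then

$$I(\theta) = \begin{pmatrix} I(\alpha_1, \beta_1) & 0 \\ 0 & I(\alpha_2, \beta_2) \end{pmatrix}.$$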

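To ensure that Lehmann's (1983, p. 345) definition of asymptotic relative
efficiency is well defined, we must assume that (2.4) is true when comparing
the asymptotic variances of $\hat\tau_c$ and $\hat\tau_{gm}$. Use of the delta method and (2.4)
gives

$$\text{ARE} = \frac{x_c'\, I^{-1}(\theta)\, x_c}{x_{gm}'\, I^{-1}(\theta)\, x_{gm}},$$

where $x_c' = (\alpha_1, \alpha_1, \alpha_2, \alpha_2)/(\alpha_1 + \alpha_2)$ and $x_{gm}' = \frac{1}{2}(1, 1, 1, 1)$.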
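The ARE can be evaluated numerically from the block-diagonal information matrix. The sketch below assumes independence of the two BBD fits and that (2.4) holds. It takes the delta-method gradient of $\hat\tau_c$ under (2.4) to be $(\alpha_1, \alpha_1, \alpha_2, \alpha_2)/(\alpha_1 + \alpha_2)$ and that of $\hat\tau_{gm}$ to be $\frac{1}{2}(1, 1, 1, 1)$. The function names and the summation index inside `A` are illustrative choices.

```python
from math import exp, lgamma


def bbd_pmf(x, k, a, b):
    """Beta-binomial mass function BBD(k, a, b)."""
    return exp(lgamma(k + 1) - lgamma(x + 1) - lgamma(k - x + 1)
               + lgamma(x + a) + lgamma(k - x + b) - lgamma(a) - lgamma(b)
               + lgamma(a + b) - lgamma(a + b + k))


def A(r, l):
    """A(r, l) = sum over j = 0, ..., l - 1 of 1 / (r + j)^2."""
    return sum(1.0 / (r + j) ** 2 for j in range(l))


def bbd_inverse_information(k, a, b):
    """Inverse of the 2x2 information matrix I(alpha, beta)."""
    c = A(a + b, k)
    i_aa = sum(A(a, x) * bbd_pmf(x, k, a, b) for x in range(1, k + 1)) - c
    i_bb = sum(A(b, k - x) * bbd_pmf(x, k, a, b) for x in range(0, k)) - c
    i_ab = -c
    det = i_aa * i_bb - i_ab ** 2
    return ((i_bb / det, -i_ab / det), (-i_ab / det, i_aa / det))


def quad(v, m):
    """Quadratic form v' m v for a 2-vector v and 2x2 matrix m."""
    return sum(v[r] * m[r][c] * v[c] for r in range(2) for c in range(2))


def are(k, a1, b1, a2, b2):
    """AV(tau_c) / AV(tau_gm), assuming (2.4) and independent BBD fits."""
    inv1 = bbd_inverse_information(k, a1, b1)
    inv2 = bbd_inverse_information(k, a2, b2)
    w1, w2 = a1 / (a1 + a2), a2 / (a1 + a2)
    av_c = quad((w1, w1), inv1) + quad((w2, w2), inv2)
    av_gm = quad((0.5, 0.5), inv1) + quad((0.5, 0.5), inv2)
    return av_c / av_gm
```

For $\alpha_1 = 1$, $\beta_1 = 2$, $\alpha_2 = 2$, $\beta_2 = 1$ the two information blocks are mirror images of each other, so the ratio is $10/9 \approx 1.11$ for every $k$.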

Clearly, when $\alpha_1 = \alpha_2$ the ARE = 1. Some AREs for selected $\alpha_i$'s and
$\beta_i$'s are given in Table I. The table shows that the ARE is greater than one
for three of the four cases considered. An interesting observation is that for
given $\alpha_i$ and $\beta_i$ the ARE does not vary much with $k$. Due to the complexity
of the information matrix the author was unable to find conditions on $\alpha_i$ and
$\beta_i$ under which $\text{ARE} > 1$. Hence, to check the conjecture that $\text{ARE} > 1$
most of the time, two hundred randomly chosen $\alpha_i$'s and $\beta_i$'s were selected
to conform to (2.4) and the ARE calculated. Some 144/200 = 72% of the
cases had $\text{ARE} > 1$. Hence $\hat\tau_{gm}$ is asymptotically more efficient than $\hat\tau_c$ for
approximately three-quarters of the possible $\alpha_i$ and $\beta_i$ which satisfy (2.4).

Table I: ARE comparison of $\hat\tau_c$ and $\hat\tau_{gm}$ for some $\alpha_i$'s and $\beta_i$'s.

$\alpha_1$   $\beta_1$   $\alpha_2$   $\beta_2$     $k = 2$   $k = 4$   $k = 8$
1            2           2            1             1.11      1.11      1.11
0.5          1           0.2          1.3           0.93      0.92      0.91
0.1          0.4         0.3          0.3           1.02      1.01      1.01
10           5           3            12            1.24      1.24      1.24

4.  APPLICATIONS

Suppose an advertiser is about to launch an advertising campaign by
placing $k$ ads in each of two different magazines. To evaluate the effectiveness
of the campaign the advertiser would like to estimate the proportion of the
population which sees at least one of the ads (known as the reach). We say
that a person is exposed to an ad when he or she sees the ad. Let $Y_1$ = the
number of exposures exclusive to magazine 1, $Y_2$ = the number of exposures
exclusive to magazine 2, and $Y_3$ = the number of exposures to both magazines
1 and 2, for a particular person, with $0 \le Y_i \le k$, $i = 1, 2, 3$. Then $X_i$, as
defined in Section 2, is the number of exposures a person has to magazine
$i$, $i = 1, 2$. Chandon (1976) modelled $Y$ with the DMD, and the exposure
distribution for a single magazine ($X_i$ here) was first modelled with the BBD
by Metheringham (1964). The DMD has also been used to model combined
TV and magazine exposure data (Rust and Leone 1984).

In the media survey we used for our data two questions were asked of
the respondents (for weekly magazines):

Q1) “Have you personally read or looked into any issue of ... (magazine
name) in the last seven days - it doesn't matter where?” (Has a Y/N answer).

Q2) “How many different issues of ... (magazine name), if any, do you
personally read or look into in an average month - it doesn't matter where?”
(Has answer 0, 1, 2, 3, 4 issues).

The wording of Q1 and Q2 was modified appropriately for two-weekly,
monthly and two-monthly magazines. These questions were asked for forty
different magazines.

An implicit assumption in the magazine advertising field is that a person
who reads a magazine is exposed to all the advertisements in that magazine.
This is unlikely to be true for people who meet the criterion of “read” in
Q1 and Q2. However, it is usually impractical to ask respondents which
advertisements they have been exposed to, so we cannot avoid making this
assumption for the available data.

There are many media schedules an advertising agency can specify whose
exposure distribution cannot be directly estimated from Q1 and Q2. For
example, a schedule with 3 ads in each of two different magazines cannot
be estimated using Q1 and Q2. We want, therefore, to construct a model
which not only estimates observable exposure distributions accurately but
can be used to estimate (or predict) exposure distributions outside the range
of exposure distributions covered by Q1 and Q2.

The first step in using (1.1) to model the exposure distribution of $Y$ is to
estimate the parameters of the DMD. When solving the likelihood equations
we come up against a data problem. The response to Q2 does not tell us
what a person's reading behavior was in a specified week; it only gives us the
total number of issues read in the last four weeks. If a person's response to
Q2 is “4” then clearly they saw each issue, but if their response is “2” we
have no way of knowing which two issues were read. Hence Q2 cannot give
us the data required to fit (1.1). We can, however, use Q1 to fit (1.1) because
here a Yes/No response tells us precisely whether or not a person read the
last issue of a magazine. The problem this time is that $k = 1$ when using
Q1 to fit the DMD and we saw in Section 1 that the conventional methods
for estimating $\tau$ do not work when $k = 1$. We can, therefore, use (2.9) to
estimate $\tau$ in the following way.

The response to Q1 gives us data $(n_{i0}, n_{i1}, n_{i2}, n_{i3})$, with $\sum_j n_{ij} = 1$ and $n_{ij} \in
\{0, 1\}$ $\forall\, i$. These data can be used to estimate $\lambda_j$ using (1.2). The response
to Q2 gives us data $\{n_{x_i}\}$, $i = 1, 2$, where $n_{x_i}$ is the number of people in the
sample who have $x_i$ exposures to magazine $i$, $0 \le x_i \le 4$, $i = 1, 2$. Hence we
can use $\{n_{x_i}\}$ to get estimates of $\alpha_i$ and $\beta_i$ in (2.3) by using the method of
moments or maximum likelihood estimation for the BBD.

Table II: Observed Exposure Distributions and Parameter Estimates for the
New Zealand Listener (NZL) and Time Magazine; $n = 5201$.

Observed Univariate Exposure Distribution

Exposures       0      1      2      3      4
NZL          2741    322    286     94   1758
Time         4373    301    186     54    287

Bivariate Data

$(n_{.0}, n_{.1}, n_{.2}, n_{.3}) = (2975, 1741, 189, 296)$, where $n_{.j} = \sum_i n_{ij}$

Parameter Estimates

$\hat\alpha_1 = 0.0743$    $\hat\beta_1 = 0.1103$    $\hat\tau_c = 0.2473$
$\hat\alpha_2 = 0.0498$    $\hat\beta_2 = 0.4517$    $\hat\tau_{gm} = 0.3043$
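The method-of-moments step for the BBD can be sketched as follows, using the univariate frequencies of Table II as input. The function name is hypothetical and the variance uses divisor $n$, which is an assumption. The estimates in Table II are MLEs, so this sketch is not expected to reproduce them exactly.

```python
def bbd_moments_fit(freq):
    """Method of moments for BBD(k, alpha, beta) from exposure frequencies.

    freq[x] is the number of people with x exposures, x = 0, ..., k.
    """
    k = len(freq) - 1
    n = sum(freq)
    mean = sum(x * f for x, f in enumerate(freq)) / n
    var = sum((x - mean) ** 2 * f for x, f in enumerate(freq)) / n
    p = mean / k                        # estimates alpha / (alpha + beta)
    ratio = var / (k * p * (1.0 - p))   # estimates (S + k) / (S + 1)
    if not 1.0 < ratio < k:
        raise ValueError("sample moments are not compatible with a BBD")
    s = (k - ratio) / (ratio - 1.0)     # S = alpha + beta
    return p * s, (1.0 - p) * s


# Observed univariate exposure distributions from Table II.
nzl = [2741, 322, 286, 94, 1758]
time_mag = [4373, 301, 186, 54, 287]
a1, b1 = bbd_moments_fit(nzl)
a2, b2 = bbd_moments_fit(time_mag)
```

The fit matches the BBD mean $k\alpha/(\alpha + \beta)$ and the variance inflation factor $(\alpha + \beta + k)/(\alpha + \beta + 1)$ relative to a binomial.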

Once the parameters of (2.2) have been estimated we can estimate (or
predict) the mass function of $(X_1, X_2)$ for values of $k$ other than 1 or 4, the
values available from the data.

In Table II we give the observed univariate and bivariate exposure distributions
for the New Zealand Listener and Time Magazine along with the
MLEs for $\alpha_i$ and $\beta_i$. Since (2.4) does not hold (even approximately) for the
parameter estimates we cannot use our derived form of the ARE to compare
the asymptotic efficiency of $\hat\tau_c$ and $\hat\tau_{gm}$.

The exposure distribution of interest to advertisers is not the bivariate
exposure distribution $(X_1, X_2)$, but rather $X_{tot} = X_1 + X_2$, i.e., the total
number of exposures a person has to the ad campaign. Having estimated the
parameters of (2.2) we obtain an estimate of the probability mass function of
$X_{tot}$ by a change of variables. If we denote the mass function of the exposure
distribution as $f(X_{tot})$ then reach is $1 - f(X_{tot} = 0)$. The observed reach is
52.5% while the estimated reaches using $\hat\tau_{gm}$ and $\hat\tau_c$ are, respectively, 53.1%
and 51.5%. Hence use of $\hat\tau_{gm}$ in (2.2) gives a closer estimate of reach than
when $\hat\tau_c$ is used.
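Since $X_{tot} = 0$ only when every one of the $k$ trials falls in event 0, reach can be computed directly from (1.1). The sketch below assumes a schedule of $k = 4$ issues and takes $\hat\lambda_0$ from the bivariate data of Table II via (1.2). Both are assumptions about how the figures above were obtained, so the output need not agree with them to the last digit.

```python
def reach(k, tau, lam0):
    """1 - P(X_tot = 0), where X_tot = 0 means all k trials fall in event 0."""
    p_none = 1.0
    for l in range(k):
        # Gamma ratios in (1.1) reduce to this product when y_1 = y_2 = y_3 = 0.
        p_none *= (l + tau * lam0) / (l + tau)
    return 1.0 - p_none


lam0_hat = 2975 / 5201                 # share of respondents reading neither title
reach_gm = reach(4, 0.3043, lam0_hat)  # k = 4 is an assumed schedule length
reach_c = reach(4, 0.2473, lam0_hat)
```

A larger $\tau$ implies less heterogeneity between people, so the geometric mean estimate gives the higher of the two reach values.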

To further demonstrate the usefulness of the geometric mean estimate
of $\tau$ we consider the data from Mosimann (1962). These data are frequencies
of occurrence of different pollen grains made at $n = 73$ different core levels.
Here the pollen counts totalled 100 at each core, i.e., $k = 100$, so there
is no need to use $\hat\tau_{gm}$ as the four methods outlined in Section 1 are all
applicable, with maximum likelihood being the best (Chuang and Cox 1985).
Nonetheless, we will estimate $\tau$ with $\hat\tau_{gm}$ to show that it competes admirably
with the four estimates in Section 1.

Table III: Six Estimates of $\tau$ for Mosimann's (1962) pollen data.

Mosimann's   Brier's   pseudo-MLE   MLE     $\hat\tau_{gm}$   $\hat\tau_c$
81.92        73.21     62.97        60.19   57.76             54.23

Let $Y_0$ = pine, $Y_1$ = oak, $Y_2$ = alder and $Y_3$ = fir pollen counts (cf.
Mosimann (1962) for details of these data).

Table III gives the estimates of $\tau$ using the four techniques in Section
1 as well as $\hat\tau_{gm}$ and $\hat\tau_c$ when $\alpha_i$ and $\beta_i$ are estimated by maximum
likelihood. The estimate which is closest to the MLE is $\hat\tau_{gm}$, even closer to
the MLE than the pseudo-MLE. Chuang and Cox (1985) point out that the
pseudo-MLE is both easier to calculate and asymptotically comparable to the
MLE. Their estimate does require some degree of programming, however, as
do the MLEs of $\alpha_i$ and $\beta_i$ used to calculate $\hat\tau_{gm}$. If $\alpha_i$ and $\beta_i$ for $\hat\tau_{gm}$ are
estimated by the method of moments the computations required can easily be
conducted on a calculator. Estimating $\alpha_i$ and $\beta_i$ by the method of moments
for Mosimann's data gives $\hat\tau_{gm} = 56.69$. This estimate is still quite close to
the MLE of $\tau$ in Table III and has the advantage of requiring no
programming whatsoever.